Abstract




Several procedures have been studied to select the best among a set
of new treatments (populations) which are better than a standard (or control)
using two-stage procedures for the case of normal populations. One such
procedure is to select the best based on the confidence intervals with a
specified fixed width $2d$ after eliminating those populations which are worse
than the standard based on the expected posterior losses. Several papers deal
with this kind of problem but none of them is based on the so-called
$100(1-2\alpha)\%$ Highest Posterior Density (HPD) credible regions, which are
conceptually equivalent to the confidence intervals, with a fixed width $2d$.
After retaining good populations based on the expected posterior losses, we
set up a stopping rule $N_i$ for constructing the HPD credible region for each
selected population, which is asymptotically efficient and consistent.
Thereafter, we develop several different decision criteria based on the whole
samples or the HPD credible regions. For applications, we use a noninformative
prior for the unknown means $\theta_i$ and unknown variances $\sigma_i^2$ of
normal populations, which might lead to robustness. Here we use $0$-$k_p$
losses at Stage 1 and a stopping rule $N_i$ which provides a
$100(1-2\alpha)\%$ HPD credible region for each selected population with a
fixed width $2d$ to decide on the choice of the best population based on the
overall sample means at Stage 2.

1. Introduction

Since the early work of Bechhofer, Dunnett and Sobel (1954) on the
two-sample (two-stage) problem for selecting the normal population associated
with the largest unknown mean from $k\ (\geq 2)$ normal populations, several
different two-stage procedures have been studied for the following three
cases: (i) known variances, (ii) common unknown variances and (iii) unknown
and unequal variances.

Among  these  different  procedures,  there  are  mainly  two  types:  (a) 
elimination  type  rules  which  select  a  subset  by  eliminating  some  non-best 
populations  at  the  first  stage  and  take  additional  samples  according  to  the 
sampling  scheme  based  on  some  design  criteria  to  decide  on  the  choice  of  the 
best  at  the  second  stage,  and  (b)  nonelimination  type  procedures  for  which 
one  decides  on  sample  sizes  at  the  first  stage  and  then  takes  additional 
samples  on  all  populations  so  as  to  decide  on  the  selection  of  the  best. 

Most  of  these  procedures  use  the  so-called  indifference  zone  approach 
introduced  by  Bechhofer  (1954),  and  especially  for  the  elimination  type 
procedures,  at  the  first  stage  subset  selection  procedures  are  used:  this 
approach  was  introduced  by  Gupta  (1956). 

For  the  elimination  type  procedures,  Alam  (1970)  studied  the  known 
variances  case.  Tamhane  and  Bechhofer  (1977,  1979),  using  a  minimax  criterion, 
also  studied  the  known  variances  case.  Gupta  and  Kim  (1982)  and  Tamhane 
(1975)  have  considered  the  common  unknown  variances  case. 

For the nonelimination type rules, Bechhofer, Dunnett and Sobel (1954)
have studied the common unknown variances case and Dudewicz and Dalal (1975),
Rinott (1978), Bofinger (1979) and Mukhopadhyay (1979) have considered the
problem for the unknown and unequal variances case. Recently Gupta and
Miescke (1981, 1982), among others, have studied the problem under a
decision-theoretic Bayesian framework.

In this paper, we propose an elimination type procedure in a Bayesian
setting, which retains good populations based on the expected posterior loss.
We use certain loss functions and prior distributions. We also use a stopping
rule to construct the $100(1-2\alpha)\%$ Highest Posterior Density (HPD)
credible region, which is conceptually equivalent to a $100(1-2\alpha)\%$
confidence interval, with a fixed width $2d$. Then we decide on the selection
of the best based on some criteria.

For an application of this procedure, we use a $0$-$k_p$ type loss function
and a noninformative prior for unknown parameters and select the best based on
the overall sample means.


2. General Framework for the Proposed Procedure $R(\alpha,d)$.

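Let $\pi_i$, $i = 1,2,\ldots,k$, be $k$ normal populations with unknown means
$\theta_i$ and unknown variances $\sigma_i^2$ $(0 < \sigma_i^2 < \infty)$.
Also let $X_i$ be the (observable) characteristic associated with $\pi_i$ and
let its probability density function be $f(x \mid \theta_i, \sigma_i^2)$.
For $i = 1,2,\ldots,k$, let $\mathbf{x}_i = (x_{i1},\ldots,x_{in_i})$ be $n_i$
realizations of the random variable $X_i$. Let $\tau(\theta_i, \sigma_i^2)$ be
a prior distribution of $(\theta_i,\sigma_i^2)$ which is absolutely
continuous. Then if $\tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i)$ is the
posterior distribution of $(\theta_i,\sigma_i^2)$, then by definition,

(2.1)  $\tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i) = \dfrac{\tau(\theta_i,\sigma_i^2)\prod_{j=1}^{n_i} f(x_{ij} \mid \theta_i,\sigma_i^2)}{m(\mathbf{x}_i)}$,

where

(2.2)  $m(\mathbf{x}_i) = \iint \tau(\theta_i,\sigma_i^2)\prod_{j=1}^{n_i} f(x_{ij} \mid \theta_i,\sigma_i^2)\,d\theta_i\,d\sigma_i^2$.
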
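Also the marginal posterior distributions of $\theta_i$ and $\sigma_i^2$ can
be obtained by

(2.3)  $\tau_1(\theta_i \mid \mathbf{x}_i) = \int \tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i)\,d\sigma_i^2$,

and

(2.4)  $\tau_2(\sigma_i^2 \mid \mathbf{x}_i) = \int \tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i)\,d\theta_i$.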

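$\pi_i$ is said to be good if $\theta_i \in (c,\infty)$, where $c$ is a
constant specified a priori by the experimenter. Then our loss structure is
as follows:

(2.5)  $L(\theta_i, a_p) = \begin{cases} 0 & \text{if } \theta_i \in \Theta_p,\ p = 0,1, \\ \ell_p(c,\theta_i) & \text{if } \theta_i \in \Theta - \Theta_p, \end{cases}$

where $\Theta = \mathbb{R}^1$, $\Theta_0 = (c,\infty)$, and where the action
space $\mathcal{G} = \{a_0, a_1\}$. Here the action $a_0$ accepts $\pi_i$ as a
good population and the action $a_1$ rejects $\pi_i$ as a non-best
(non-contending) population.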


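Definition (see Berger (1980)). The $100(1-2\alpha)\%$ HPD credible region
for $\theta_i$ is the subset $C_i(1-2\alpha)$ of $\Theta$ of the form

(2.6)  $C_i(1-2\alpha) = \{\theta_i \in \Theta;\ \tau_1(\theta_i \mid \mathbf{x}_i) \geq k(2\alpha)\}$,

where $k(2\alpha)$ is the largest constant such that

(2.7)  $\Pr(C_i(1-2\alpha) \mid \mathbf{X}_i = \mathbf{x}_i) \geq 1-2\alpha$.


Remark: If $\tau_1(\theta_i \mid \mathbf{x}_i)$ is not unimodal, the credible
region $C_i(1-2\alpha)$ may consist of several disjoint intervals. To avoid
this kind of complexity, we assume here that
$\tau_1(\theta_i \mid \mathbf{x}_i)$ is (strongly) unimodal. Let
$C_i(1-2\alpha) = (a_i, b_i)$. Then $C_i(1-2\alpha)$ can be constructed by the
following equations.

(2.8)  $\tau_1(a_i \mid \mathbf{x}_i) = \tau_1(b_i \mid \mathbf{x}_i)$,

(2.9)  $\int_{a_i}^{b_i} \tau_1(\theta_i \mid \mathbf{x}_i)\,d\theta_i = 1-2\alpha$.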
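
As an illustration of the definition, the following minimal sketch approximates an HPD region on a grid by keeping the points of highest posterior density until the required mass $1-2\alpha$ is reached. The function name `hpd_region`, the grid size and the standard normal test density are assumptions made for this example only; they do not come from the paper.

```python
import math

def hpd_region(density, lo, hi, mass, n_grid=20001):
    """Approximate the HPD region {theta: density(theta) >= k} on [lo, hi].

    Returns the end points (a, b); this is the whole region only when the
    density is unimodal, as assumed in the Remark.
    """
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    dens = [density(t) for t in grid]
    total = sum(dens) * step  # normalising constant on the grid
    # Visit grid points from highest to lowest density, mirroring (2.6).
    order = sorted(range(n_grid), key=lambda i: dens[i], reverse=True)
    acc, kept = 0.0, []
    for i in order:
        kept.append(i)
        acc += dens[i] * step / total
        if acc >= mass:  # smallest set with posterior mass >= 1 - 2*alpha
            break
    return grid[min(kept)], grid[max(kept)]

def std_normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

# 95% region (alpha = 0.025) for a standard normal "posterior".
a, b = hpd_region(std_normal_pdf, -8.0, 8.0, mass=0.95)
print(a, b)
```

For a symmetric unimodal density the two end points come out symmetric about the mode, which is the content of (2.8).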
Hence, now we can set up a stopping rule for each $\pi_i$ selected at
Stage 1 as follows:

At Stage 2, take additional observations from each selected $\pi_i$ such
that

(2.14)  $N_i = \inf\{n;\ n \geq n_0 \text{ and } n \geq n(\mathbf{x}_i,\alpha,d)\}$,

where $n(\mathbf{x}_i,\alpha,d)$ is the solution of the equation (2.13) and
$n_0(\alpha,d)$, which is used at Stage 1, is decided based on the known
proportion of $\lim_{n_i \to \infty} n(\mathbf{x}_i,\alpha,d)$.

Then two possible orders are

(2.15)  $a_{[1]} \leq a_{[2]} \leq \cdots \leq a_{[s]}$,

(2.16)  $b_{[1]} \leq b_{[2]} \leq \cdots \leq b_{[s]}$,

where $s$ is the size of the subset $S$ selected at Stage 1. Hence we can
denote

(2.17)  $C_{(1)}(1-2\alpha) \leq C_{(2)}(1-2\alpha) \leq \cdots \leq C_{(s)}(1-2\alpha)$

or

(2.18)  $\pi_{(1)} \leq \pi_{(2)} \leq \cdots \leq \pi_{(s)}$

corresponding to the order (2.15) or (2.16). Then we can select populations
based on the following two decisions.


Decision A. If we define the population $\pi_{(s)}$ associated with the
credible region $C_{(s)}(1-2\alpha)$ to be the best, then select
$\pi_{(i)}, \pi_{(i+1)}, \ldots, \pi_{(s)}$ corresponding to
$C_{(i)}(1-2\alpha), C_{(i+1)}(1-2\alpha), \ldots, C_{(s)}(1-2\alpha)$, where
$i$ is the first $j$ such that

(2.19)  $b_{[j]} \geq a_{[s]} + d_0$,

where $d_0$ is defined by a suitable condition on the minimum probability of a
correct selection (PCS).

Decision B. If the population $\pi_i$ with the largest unknown mean among
good ones is defined as the best, then the selection procedure is as follows:

Select $\pi_i$ corresponding to $\bar{X}_i = \max_{\pi_j \in S} \dfrac{1}{N_j} \sum_{t=1}^{N_j} X_{jt}$.


Remark. The minimum PCS may be invoked depending upon the type of
the decision. Also the minimum PCS can vary depending on the type of the
problem.
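
The two decisions can be sketched in a few lines of code. This is only an illustration: it reads the inequality in (2.19) as $b_{[j]} \geq a_{[s]} + d_0$, which is an assumption about the garbled original, and the names `decision_a`, `decision_b`, the population labels and the numbers are all hypothetical.

```python
def decision_a(intervals, d0):
    """intervals: dict name -> (a, b), credible regions of common width 2d.

    Returns the names pi_(i), ..., pi_(s) selected under Decision A.
    """
    # Order populations by lower end point, as in (2.15) and (2.18).
    ordered = sorted(intervals, key=lambda name: intervals[name][0])
    a_top = intervals[ordered[-1]][0]  # a_[s]
    # i is the first j with b_[j] >= a_[s] + d0 (assumed reading of (2.19)).
    for idx, name in enumerate(ordered):
        if intervals[name][1] >= a_top + d0:
            return ordered[idx:]
    return ordered[-1:]  # only reached if d0 exceeds the common width 2d

def decision_b(samples):
    """samples: dict name -> all observations from both stages.

    Returns the population with the largest overall sample mean.
    """
    return max(samples, key=lambda name: sum(samples[name]) / len(samples[name]))

# Hypothetical credible regions with d = 0.5 and hypothetical data.
intervals = {"pi1": (0.0, 1.0), "pi2": (0.6, 1.6), "pi3": (1.2, 2.2)}
chosen_a = decision_a(intervals, d0=0.3)
best_b = decision_b({"pi2": [1.0, 1.2, 1.1], "pi3": [1.6, 1.8, 1.7]})
print(chosen_a, best_b)
```

Decision A returns a subset that may contain more than one population, while Decision B always returns a single population.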


3. An Application of the Procedure $R(\alpha,d)$.

Let $\pi_i$, $i = 1,2,\ldots,k$, be $k$ normal populations with unknown mean
$\theta_i$ and unknown variance $\sigma_i^2$ $(0 < \sigma_i^2 < \infty)$ and
let $X_{ij}$, $j = 1,2,\ldots,n_0$; $i = 1,2,\ldots,k$, be $n_0$ independent
samples from a population $\pi_i$, where $n_0$ is defined later. We define a
population $\pi_i$ to be good if $\theta_i \in (c,\infty)$, where $c$ is a
constant a priori specified by the experimenter. Our goal is to select the
population associated with the largest $\theta_i$ among good populations. Let
our loss be as follows:

(3.1)  $L(\theta_i, a_p) = \begin{cases} 0 & \text{if } \theta_i \in \Theta_p,\ p = 0,1, \\ k_p & \text{if } \theta_i \in \Theta - \Theta_p, \end{cases}$

where $\Theta_0 = \{\theta_i;\ \theta_i \in (c,\infty)\}$,
$\Theta = \mathbb{R}^1$, and the action $a_0$ accepts $\pi_i$ as good and the
action $a_1$ rejects $\pi_i$ as not good. We are going to use a noninformative
prior distribution $\tau(\theta_i, \sigma_i^2)$, where

(3.2)  $\tau(\theta_i,\sigma_i^2) = \sigma_i^{-2}\, I_{(0,\infty)}(\sigma_i^2)$,

where $I(\cdot)$ is a usual indicator function. The preceding prior, in some
sense, provides robustness.

Stage 1. Let $n_0 = \max\{2, [z^2(\alpha)/d^2] + 1\}$, where $2d$ is the
fixed width of the $100(1-2\alpha)\%$ HPD credible region which will be set up
at Stage 2, and $z(p)$ is the $100p\%$ (upper) percentile of the standard
normal distribution. Let $x_{ij}$ be a realization of $X_{ij}$ and
$\mathbf{x}_i = (x_{i1},\ldots,x_{in_0})$ and
$\bar{x}_i = \sum_{j=1}^{n_0} x_{ij}/n_0$.

By definition, the marginal posterior distribution
$\tau_1(\theta_i \mid \mathbf{x}_i)$ of $\theta_i$ is a Student's
t-distribution with $(n_0-1)$ degrees of freedom, the location parameter
$\bar{x}_i$ and the scale parameter
$\sum_{j=1}^{n_0} (x_{ij}-\bar{x}_i)^2 / [n_0(n_0-1)]$.
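
The t-form of the marginal posterior can be checked directly from (2.1) and (2.3) with the prior (3.2) and a normal likelihood. The sketch below is the standard textbook calculation, not necessarily the authors' own derivation; it writes $n$ for $n_0$ and $s_i^2 = \sum_{j}(x_{ij}-\bar{x}_i)^2/(n-1)$.

```latex
\begin{align*}
\tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i)
  &\propto (\sigma_i^2)^{-(n/2+1)}
     \exp\!\left\{-\frac{(n-1)s_i^2 + n(\theta_i-\bar{x}_i)^2}{2\sigma_i^2}\right\},\\
\tau_1(\theta_i \mid \mathbf{x}_i)
  &= \int_0^\infty \tau(\theta_i,\sigma_i^2 \mid \mathbf{x}_i)\,d\sigma_i^2
   \propto \left[(n-1)s_i^2 + n(\theta_i-\bar{x}_i)^2\right]^{-n/2}\\
  &\propto \left[1 + \frac{1}{n-1}\,
       \frac{(\theta_i-\bar{x}_i)^2}{s_i^2/n}\right]^{-\frac{(n-1)+1}{2}} .
\end{align*}
```

The last line is the kernel of a Student's t density with $n-1$ degrees of freedom, location $\bar{x}_i$ and squared scale $s_i^2/n$.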

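Then

(3.3)  $E^{\tau_1(\theta_i \mid \mathbf{x}_i)}\big(L(\theta_i,a_0)\big) = k_0 \Pr(\Theta_1 \mid \mathbf{x}_i)$

and

(3.4)  $E^{\tau_1(\theta_i \mid \mathbf{x}_i)}\big(L(\theta_i,a_1)\big) = k_1 \Pr(\Theta_0 \mid \mathbf{x}_i)$.

Thus, at Stage 1, we retain $\pi_i$ iff

(3.5)  $k_0 \Pr(\Theta_1 \mid \mathbf{x}_i) \leq k_1 \Pr(\Theta_0 \mid \mathbf{x}_i)$,

or, equivalently, we retain $\pi_i$ iff

(3.6)  $\Pr(\Theta_1 \mid \mathbf{x}_i) \leq \dfrac{k_1}{k_0 + k_1}$.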

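For an explicit expression for $\Pr(\Theta_1 \mid \mathbf{x}_i)$, see the
following Lemma 1.


Lemma 1.

$\Pr(\Theta_1 \mid \mathbf{x}_i) = \int_{\Theta_1} dF^{\tau_1(\theta_i \mid \mathbf{x}_i)} = \begin{cases} 1 - \dfrac{1}{2}\, I_u\!\left(\dfrac{n_0-1}{2}, \dfrac{1}{2}\right) & \text{if } c - \bar{x}_i \geq 0, \\[2ex] \dfrac{1}{2}\, I_u\!\left(\dfrac{n_0-1}{2}, \dfrac{1}{2}\right) & \text{if } c - \bar{x}_i < 0, \end{cases}$

where $I_x(a,b)$ is an incomplete Beta-function,
$u = (n_0-1)/\big((n_0-1)+t^2\big)$,
$t = (c-\bar{x}_i)/(s_i/\sqrt{n_0})$, and
$s_i^2 = \sum_{j=1}^{n_0} (x_{ij}-\bar{x}_i)^2/(n_0-1)$.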


Let $S$ be the selected subset of Stage 1 and let $s$ be its size. Then

(i) if $s = 0$, we decide that none of the populations is good and stop,

(ii) if $s = 1$, we decide that the population selected is the only good one
and the best at the same time and stop,

(iii) if $s \geq 2$, we proceed to Stage 2.
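
The Stage 1 screening can be sketched numerically as follows. The sketch takes $\Theta_1 = (-\infty, c]$ (an assumption, the complement of $\Theta_0$) and evaluates $\Pr(\Theta_1 \mid \mathbf{x}_i)$ as a Student's t probability by Simpson's rule rather than through incomplete Beta tables. The helper names `t_pdf`, `t_cdf`, `retain` and the data are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|] plus symmetry about 0."""
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half

def retain(sample, c, k0, k1):
    """Rule (3.5): retain iff k0 * Pr(Theta_1 | x) <= k1 * Pr(Theta_0 | x)."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    t = (c - mean) / math.sqrt(s2 / n)
    p_theta1 = t_cdf(t, n - 1)  # posterior probability that theta_i <= c
    return k0 * p_theta1 <= k1 * (1 - p_theta1), p_theta1

# Hypothetical first-stage samples with control value c = 9.5 and k0 = k1 = 1.
keep_good, p_good = retain([10.2, 9.8, 10.5, 10.1, 9.9], c=9.5, k0=1.0, k1=1.0)
keep_bad, p_bad = retain([9.0, 9.2, 8.9, 9.1, 9.3], c=9.5, k0=1.0, k1=1.0)
print(keep_good, p_good, keep_bad, p_bad)
```

With equal losses the rule keeps a population exactly when its posterior probability of exceeding $c$ is at least one half; unequal $k_0$, $k_1$ shift that threshold as in (3.6).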


Stage 2. Now we want to set up a $100(1-2\alpha)\%$ HPD credible region
$C_i(1-2\alpha)$ for $\theta_i$ of each population selected at Stage 1 with a
common fixed width $2d$.


A Procedure for constructing the credible region $C_i(1-2\alpha)$.

Let $g(\theta_i \mid n-1, \bar{x}_i, s_i^2/n)$ be the pdf of a Student's
t-distribution with $(n-1)$ degrees of freedom, the location parameter
$\bar{x}_i$ and the scale parameter $s_i^2/n$, where $\bar{x}_i$ and $s_i^2$
are defined the same as before. Let $C_i(1-2\alpha) = (a_i, b_i)$. Then since
$g(\theta_i \mid n-1, \bar{x}_i, s_i^2/n)$ is strongly unimodal and symmetric
about $\bar{x}_i$, the following two equations provide the credible region
$C_i(1-2\alpha)$.

(3.7)  $g(a_i \mid n-1, \bar{x}_i, s_i^2/n) = g(b_i \mid n-1, \bar{x}_i, s_i^2/n)$

and

(3.8)  $\int_{a_i}^{b_i} g(\theta_i \mid n-1, \bar{x}_i, s_i^2/n)\,d\theta_i = 1-2\alpha$.

Transform $\xi_i = \sqrt{n}\,(\theta_i-\bar{x}_i)/s_i$ and by the equations
(3.7) and (3.8),

(3.9)  $a_i + b_i = 2\bar{x}_i$

and

(3.10)  $\Pr\{a_i < \theta_i < b_i\} = \Pr\left\{\dfrac{a_i-\bar{x}_i}{s_i/\sqrt{n}} < \xi_i < \dfrac{b_i-\bar{x}_i}{s_i/\sqrt{n}}\right\} = \Pr\left\{|\xi_i| < \dfrac{\bar{x}_i-a_i}{s_i/\sqrt{n}}\right\} = 1 - I_{u_{2\alpha}}\!\left(\dfrac{n-1}{2}, \dfrac{1}{2}\right) = 1 - 2\alpha$.

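Then from the table of an incomplete Beta-function (e.g. Pearson (1934)),
we can get the percentile point $c_0$ of the beta distribution for which the
last equality in (3.10) holds, that is,
$I_{c_0}\!\left(\frac{n-1}{2}, \frac{1}{2}\right) = 2\alpha$.
Hence by Lemma 1,

(3.11)  $u_{2\alpha} = c_0 = \dfrac{n-1}{(n-1)+t^2} = \dfrac{n-1}{(n-1) + \dfrac{(\bar{x}_i-a_i)^2}{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2/[n(n-1)]}} = \dfrac{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2/n}{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2/n + (\bar{x}_i-a_i)^2}$,

which gives

(3.12)  $a_i = \bar{x}_i - \sqrt{\left(\dfrac{1}{c_0}-1\right) \dfrac{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2}{n}}$,

(3.13)  $b_i = \bar{x}_i + \sqrt{\left(\dfrac{1}{c_0}-1\right) \dfrac{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2}{n}}$.
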

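Therefore the width $2d$ of the credible region $C_i(1-2\alpha)$ is

(3.14)  $2d = 2\sqrt{\left(\dfrac{1}{c_0}-1\right) \dfrac{\sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2}{n}}$,

and this implies that

(3.15)  $n = \dfrac{\left(\dfrac{1}{c_0}-1\right) \sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2}{d^2}$.
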

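Hence a stopping rule $N_i$ which provides a $100(1-2\alpha)\%$ HPD credible
region $C_i(1-2\alpha)$ with a fixed width $2d$ is given by

(3.16)  $N_i = \inf\left\{n;\ n \geq n_0 \text{ and } n \geq \left[\dfrac{\left(\dfrac{1}{c_0}-1\right) \sum_{j=1}^{n}(x_{ij}-\bar{x}_i)^2}{d^2}\right] + 1\right\}$,
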


where $[a]$ is the largest integer less than or equal to $a$. Note that
after we stop sampling, the marginal posterior distribution
$\tau_1(\theta_i \mid \mathbf{x}_i)$ becomes a Student's t-distribution with
$(N_i-1)$ degrees of freedom, the location parameter
$\bar{x}_i = \sum_{j=1}^{N_i} x_{ij}/N_i$ and the scale parameter
$\sum_{j=1}^{N_i} (x_{ij}-\bar{x}_i)^2 / [N_i(N_i-1)]$.
At Stage 2, then we decide the population associated with the largest overall
sample mean to be the best. That is, $\pi_i$ associated with $\bar{X}_{[s]}$,
where $\bar{X}_{[1]} \leq \bar{X}_{[2]} \leq \cdots \leq \bar{X}_{[s]}$ are
the ordered values of the usual sample means $\bar{X}_i$, is said to be the
best among good ones.
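
A minimal sketch of the stopping rule (3.16) is given below. It uses the identity $(1/c_0 - 1) = t^2/(n-1)$ implied by (3.11), where $t$ is the upper $100\alpha\%$ point of Student's t with $n-1$ degrees of freedom, so no Beta table is needed. The names `t_upper`, `stopping_time`, `draw`, `max_n`, the starting size `n0=5` and the simulated data are assumptions for illustration only.

```python
import math
import random

def t_pdf(x, df):
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=400):
    """P(T <= x) for x >= 0 by Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_upper(alpha, df):
    """Upper 100*alpha% point of Student's t (0 < alpha < 0.5) by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - t_cdf(hi, df) > alpha:  # expand until the point is bracketed
        lo, hi = hi, 2.0 * hi
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if 1.0 - t_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stopping_time(draw, n0, alpha, d, max_n=10_000):
    """Sample one observation at a time until the bound in (3.16) is met."""
    xs = [draw() for _ in range(n0)]
    while len(xs) < max_n:
        n = len(xs)
        mean = sum(xs) / n
        ss = sum((x - mean) ** 2 for x in xs)
        t = t_upper(alpha, n - 1)
        # (1/c0 - 1) * ss / d^2 with (1/c0 - 1) = t^2 / (n - 1)
        bound = math.floor(t * t * ss / ((n - 1) * d * d)) + 1
        if n >= bound:
            break
        xs.append(draw())
    return len(xs), xs

rng = random.Random(0)  # fixed seed so the run is reproducible
N, xs = stopping_time(lambda: rng.gauss(10.0, 1.0), n0=5, alpha=0.025, d=0.5)
print(N)
```

When the loop stops, the half-width $t\, s_i/\sqrt{N_i}$ of the credible region is at most $d$, so an interval of width $2d$ centred at the sample mean carries posterior probability at least $1-2\alpha$.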


Lemma 2.  $\sqrt{n-1}\,\left(\dfrac{1}{c_0}-1\right)^{1/2} \to z(\alpha) = -z(1-\alpha)$ as $n \to \infty$.


Proof. The proof follows from the central limit theorem.
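
One way to read the statement, offered as a sketch rather than as the authors' argument: write $t_{\alpha,n-1}$ (a symbol not used in the original) for the upper $100\alpha\%$ point of Student's t with $n-1$ degrees of freedom. Solving (3.11) for $t^2$ then identifies the quantity in Lemma 2 with this percentage point.

```latex
\begin{align*}
c_0 = \frac{n-1}{(n-1) + t_{\alpha,n-1}^2}
  \;&\Longrightarrow\;
  (n-1)\left(\frac{1}{c_0} - 1\right) = t_{\alpha,n-1}^2 ,\\
\sqrt{n-1}\,\left(\frac{1}{c_0} - 1\right)^{1/2} = t_{\alpha,n-1}
  \;&\longrightarrow\; z(\alpha) \quad (n \to \infty),
\end{align*}
```

since the t-distribution with $n-1$ degrees of freedom converges to the standard normal distribution. Substituting this limit into (3.15) suggests that the required sample size behaves like $z^2(\alpha)\, s_i^2/d^2$ for large $n$.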